English-Arabic Transliteration
Authors
Abstract
Proper nouns are arguably the most important query words in information retrieval. If two languages use the same alphabet, the same proper nouns can be found in either language. If they use different alphabets, however, the names must be transliterated. Short vowels are usually left unmarked in Arabic documents, with the exception of a few very important texts such as the Muslim and Christian holy books. Moreover, most Arabic words follow a consonant-vowel (CV) syllable pattern, which means that most Arabic words contain a short or long vowel between two successive consonant letters. This makes it difficult to build English-Arabic transliteration pairs, since some English letters may not match any letter of the Romanized Arabic word. In the present study, we present different approaches for extracting transliterated proper-noun pairs from parallel corpora, based on different similarity measures between the English and Romanized Arabic proper nouns under consideration. The strength of our new system is that it works well for low-frequency proper-noun pairs. We evaluate the new approaches on two different English-Arabic parallel corpora. Most of our results outperform previously published results in terms of precision, recall and F-measure.
Keywords: Machine transliteration, Parallel corpora, Cross-language information retrieval.
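The abstract does not name the similarity measures it uses, so the following is only a minimal sketch of one plausible measure, not the authors' method. It assumes a hypothetical Romanization in which Arabic short vowels are absent, strips non-initial vowels from both names, and scores the pair by normalized edit distance. The function names `consonant_skeleton`, `edit_distance` and `similarity` are illustrative assumptions.

```python
VOWELS = set("aeiou")  # assumption: these letters stand for (mostly unwritten) vowels


def consonant_skeleton(name: str) -> str:
    """Lowercase the name, keep letters only, and drop every non-initial vowel."""
    letters = [c for c in name.lower() if c.isalpha()]
    return "".join(c for i, c in enumerate(letters) if i == 0 or c not in VOWELS)


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(english: str, romanized_arabic: str) -> float:
    """Score in [0, 1]; 1.0 means the consonant skeletons are identical."""
    a = consonant_skeleton(english)
    b = consonant_skeleton(romanized_arabic)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    # "mdryd" is a hypothetical Romanization with the long vowel written as "y".
    print(similarity("Madrid", "mdryd"))
```

Comparing consonant skeletons is one way to tolerate the unwritten short vowels described above; a pair would be accepted as a transliteration candidate when its score exceeds a threshold tuned on held-out data.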
Similar resources
Transliteration Experiments on Chinese and Arabic
We report the results of our transliteration experiments with language-specific adaptations in the context of two language pairs: English to Chinese, and Arabic to English. In particular, we investigate a syllable-based Pinyin intermediate representation for Chinese, and a letter mapping for Arabic.
Machine Transliteration of Names in Arabic Text
We present a transliteration algorithm based on sound and spelling mappings using finite state machines. The transliteration models can be trained on relatively small lists of names. We introduce a new spelling-based model that is much more accurate than state-of-the-art phonetic-based models and can be trained on easier-to-obtain training data. We apply our transliteration algorithm to the translit...
Developing the Transliteration Interface for Arabic Text
In Arabic-English and English-Arabic translation activities, the interface is very significant. Translation involving the Arabic language raises many issues that need to be addressed. The existing systems have some problems, and research has been initiated to improve them. Transliteration is an important component of translation. In this study, we propose an interface system for Arabic transliteration. Th...
Automatic Transliteration and Back-transliteration by Decision Tree Learning
Automatic transliteration and back-transliteration across languages with drastically different alphabets and phoneme inventories, such as English/Korean, English/Japanese, English/Arabic and English/Chinese, have practical importance in machine translation, cross-lingual information retrieval, automatic bilingual dictionary compilation, etc. In this paper, a bi-directional and to some exte...
Arabic to English Person Name Transliteration using Twitter
Social media outlets are providing new opportunities for harvesting valuable resources. We present a novel approach for mining data from Twitter for the purpose of building transliteration resources and systems. Such resources are crucial in translation and retrieval tasks. We demonstrate the benefits of the approach on Arabic to English transliteration. The contribution of this approach includ...